A Bi-Criteria Approximation Algorithm for k-Means
Authors
Abstract
We consider the classical k-means clustering problem in the setting of bi-criteria approximation, in which an algorithm is allowed to output βk > k clusters, and must produce a clustering with cost at most α times the cost of the optimal set of k clusters. We argue that this approach is natural in many settings, for which the exact number of clusters is a priori unknown, or unimportant up to a constant factor. We give new bi-criteria approximation algorithms, based on linear programming and local search, respectively, which attain a guarantee α(β) depending on the number βk of clusters that may be opened. Our guarantee α(β) is always at most 9 + ε and improves rapidly with β (for example: α(2) < 2.59, and α(3) < 1.4). Moreover, our algorithms have only polynomial dependence on the dimension of the input data, and so are applicable in high-dimensional settings.
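The guarantee described in the abstract can be written out as follows. This is a sketch of the standard formulation, not notation taken from the paper: the symbols X (the input point set), C and C' (sets of centers), cost, and OPT_k are assumed here for illustration.

```latex
% Assumed notation: X is the input point set, C a set of centers.
\[
  \operatorname{cost}(X, C) = \sum_{x \in X} \min_{c \in C} \lVert x - c \rVert^{2},
  \qquad
  \mathrm{OPT}_k = \min_{|C| = k} \operatorname{cost}(X, C).
\]
% An (alpha, beta) bi-criteria approximation returns a center set C' with
\[
  |C'| \le \beta k
  \qquad \text{and} \qquad
  \operatorname{cost}(X, C') \le \alpha \cdot \mathrm{OPT}_k .
\]
```

The output with βk centers is compared against the best solution with only k centers, which is why α(β) can fall as β grows.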
Similar resources
A hybrid DEA-based K-means and invasive weed optimization for facility location problem
In this paper, instead of the classical approach to the multi-criteria location selection problem, a new approach was presented based on selecting a portfolio of locations. First, the indices affecting the selection of maintenance stations were collected. The K-means model was used for clustering the maintenance stations. The optimal number of clusters was calculated through the Silhou...
A Constant-Factor Bi-Criteria Approximation Guarantee for k-means++
This paper studies the k-means++ algorithm for clustering as well as the class of D sampling algorithms to which k-means++ belongs. It is shown that for any constant factor β > 1, selecting βk cluster centers by D sampling yields a constant-factor approximation to the optimal clustering with k centers, in expectation and without conditions on the dataset. This result extends the previously known...
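A minimal sketch of distance-based seeding that draws βk centers may make the procedure concrete. It assumes squared Euclidean distance as the sampling weight, as in standard k-means++ seeding (the exponent is not legible in the excerpt above), and it rounds βk up with `ceil`. The names `d2_sample_centers` and `sq_dist` are hypothetical and not taken from the paper.

```python
import math
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def d2_sample_centers(points, k, beta, seed=0):
    """Pick ceil(beta * k) centers by distance-weighted sampling.

    Assumption: weights are squared distances to the nearest center
    chosen so far, as in standard k-means++ seeding.
    """
    rng = random.Random(seed)
    target = min(len(points), math.ceil(beta * k))
    # The first center is drawn uniformly at random.
    centers = [rng.choice(points)]
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < target:
        if sum(d2) == 0:
            break  # every point already coincides with a chosen center
        # Draw the next center with probability proportional to d2.
        c = rng.choices(points, weights=d2, k=1)[0]
        centers.append(c)
        d2 = [min(d, sq_dist(p, c)) for d, p in zip(d2, points)]
    return centers
```

For example, `d2_sample_centers(points, k=2, beta=2.0)` returns four input points; a point that is already a center has weight zero and is not drawn again.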
Exact algorithms for solving a bi-level location–allocation problem considering customer preferences
The issue discussed in this paper is a bi-level problem in which two rivals compete to attract customers and maximize their profits, which means that competitors seeking market share must compete in the centers that are going to be located in the near future. In this paper, a nonlinear model presented in the literature considering customer preferences is linearized. Customer behavior ...
Greedy bi-criteria approximations for k-medians and k-means
This paper investigates the following natural greedy procedure for clustering in the bi-criterion setting: iteratively grow a set of centers, in each round adding the center from a candidate set that maximally decreases clustering cost. In the case of k-medians and k-means, the key results are as follows. • When the method considers all data points as candidate centers, then selecting O(k log(1...
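The greedy procedure described above can be sketched as follows for the k-means cost. This is an illustrative reading of the one-sentence description, not the paper's implementation; the names `greedy_centers` and `sq_dist` are hypothetical, and ties are broken in favor of the earliest candidate.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def greedy_centers(points, candidates, num_centers):
    """Grow a center set greedily.

    In each round, add the candidate whose inclusion gives the lowest
    k-means cost (sum of squared distances to the nearest center).
    Returns the chosen candidate indices and the final cost.
    """
    # best[i] = squared distance from points[i] to its nearest chosen center
    best = [float("inf")] * len(points)
    chosen = []
    for _ in range(num_centers):
        best_j, best_cost = None, float("inf")
        for j, c in enumerate(candidates):
            # Cost of the clustering if candidate j were added now.
            cost = sum(min(b, sq_dist(p, c)) for b, p in zip(best, points))
            if cost < best_cost:
                best_j, best_cost = j, cost
        chosen.append(best_j)
        c = candidates[best_j]
        best = [min(b, sq_dist(p, c)) for b, p in zip(best, points)]
    return chosen, sum(best)
```

Passing the data points themselves as `candidates` corresponds to the case in which all data points are considered as candidate centers. Each round scans every candidate against every point, so this sketch favors clarity over speed.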
Design and training of artificial neural networks using an evolutionary strategy with parallel populations
Application of artificial neural networks (ANN) in areas such as classification of images and audio signals shows the ability of this artificial intelligence technique for solving practical problems. Construction and training of ANNs is usually a time-consuming and hard process. A suitable neural model must be able to learn the training data and also have the generalization ability. In this pap...
Publication date: 2016